AITopics | canonical correlation

canonical correlation

Insights on representational similarity in neural networks with canonical correlation

Neural Information Processing Systems | Nov-20-2025, 22:46:46 GMT

Comparing different neural network representations and determining how representations evolve over time remain challenging open questions in our understanding of the function of neural networks. Comparing representations in neural networks is fundamentally difficult as the structure of representations varies greatly, even across groups of networks trained on identical tasks, and over the course of training. Here, we develop projection weighted CCA (Canonical Correlation Analysis) as a tool for understanding neural networks, building off of SVCCA, a recently proposed method (Raghu et al, 2017). We first improve the core method, showing how to differentiate between signal and noise, and then apply this technique to compare across a group of CNNs, demonstrating that networks which generalize converge to more similar representations than networks which memorize, that wider networks converge to more similar solutions than narrow networks, and that trained networks with identical topology but different learning rates converge to distinct clusters with diverse representations. We also investigate the representational dynamics of RNNs, across both training and sequential timesteps, finding that RNNs converge in a bottom-up pattern over the course of training and that the hidden state is highly variable over the course of a sequence, even when accounting for linear transforms. Together, these results provide new insights into the function of CNNs and RNNs, and demonstrate the utility of using CCA to understand representations.

name change, neural network, representational similarity, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
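
To make the comparison in the abstract above concrete, here is a minimal sketch of a plain mean-CCA similarity between two sets of activations. It is not the authors' projection weighted CCA and it omits SVCCA's SVD prefiltering; the function names and the synthetic activation matrices are assumptions made for illustration.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X (n x p) and Y (n x q)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the centered column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)

def mean_cca_similarity(X, Y):
    """Unweighted mean of the canonical correlations (no projection weighting)."""
    return float(canonical_correlations(X, Y).mean())

rng = np.random.default_rng(0)
acts_a = rng.normal(size=(500, 8))   # hypothetical layer: 500 inputs x 8 neurons
mixing = rng.normal(size=(8, 8))     # random linear map, invertible almost surely
acts_b = acts_a @ mixing             # same representation up to a linear transform
acts_c = rng.normal(size=(500, 8))   # unrelated representation

sim_same = mean_cca_similarity(acts_a, acts_b)
sim_diff = mean_cca_similarity(acts_a, acts_c)
print(sim_same, sim_diff)
```

The first score is close to 1 because CCA is invariant to invertible linear transforms of either representation, which is the property that makes it usable for comparing networks whose neurons are not aligned.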

Measuring Teaching with LLMs

Hardy, Michael

arXiv.org Artificial Intelligence | Nov-7-2025

Objective and scalable measurement of teaching quality is a persistent challenge in education. While Large Language Models (LLMs) offer potential, general-purpose models have struggled to reliably apply complex, authentic classroom observation instruments. This paper uses custom LLMs built on sentence-level embeddings, an architecture better suited for the long-form, interpretive nature of classroom transcripts than conventional subword tokenization. We systematically evaluate five different sentence embeddings under a data-efficient training regime designed to prevent overfitting. Our results demonstrate that these specialized models can achieve human-level and even super-human performance, with correlations against expert human ratings above 0.65, surpassing the average human-human rater correlation. Further, through analysis of annotation context windows, we find that more advanced models (those better aligned with human judgments) attribute a larger share of score variation to lesson-level features rather than isolated utterances, challenging the sufficiency of single-turn annotation paradigms. Finally, to assess external validity, we find that aggregate model scores align with teacher value-added measures, indicating they are capturing features relevant to student learning. However, this trend does not hold at the individual item level, suggesting that while the models learn useful signals, they have not yet achieved full generalization. This work establishes a viable and powerful new methodology for AI-driven instructional measurement, offering a path toward providing scalable, reliable, and valid feedback for educator development.

correlation, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.22968

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry: Education > Educational Setting (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Graph Canonical Correlation Analysis

Park, Hongju, Bai, Shuyang, Ye, Zhenyao, Lee, Hwiyoung, Ma, Tianzhou, Chen, Shuo

arXiv.org Machine Learning | Feb-3-2025

CCA considers the following maximization problem: $\max_{a,b} a^\top X^\top Y b$ subject to $a^\top X^\top X a \le 1$ and $b^\top Y^\top Y b \le 1$, where the vectors $a$ and $b$ and the attained correlation are called the canonical vectors and the canonical correlation, respectively. In classical canonical correlation analysis, the canonical vectors $a$ and $b$ include nonzero loadings for all X and Y variables. However, in a high-dimensional setting with $p, q \gg n$, the goal is to identify which subsets of X are associated with subsets of Y and to estimate the measure of association, as the canonical correlation with the full dataset is overly high due to estimation bias caused by overfitting. To ensure sparsity, shrinkage methods are commonly used. For example, Witten et al. (2009) propose sparse canonical correlation analysis (sCCA). The criterion of sCCA can in general be expressed as follows: $\max_{a,b} a^\top X^\top Y b$ subject to $a^\top X^\top X a \le 1$, $b^\top Y^\top Y b \le 1$, $P_1(a) \le k_1$, $P_2(b) \le k_2$, where $P_1$ and $P_2$ are convex penalty functions for $a$ and $b$ with positive constants $k_1$ and $k_2$, respectively. A representative penalty function is the $\ell_1$ penalty, such that $P_1(a) = \|a\|_1$ and $P_2(b) = \|b\|_1$. sCCA imposes zero loadings in canonical vectors and thus only selects subsets of correlated X and Y. However, sCCA methods may neither fully recover correlated X and Y pairs nor capture the multivariate-to-multivariate linkage patterns (see Figure 3) because the $\ell_1$ shrinkage tends to select only a small subset from the associated variables of X and Y.

artificial intelligence, correlation, machine learning, (14 more...)

arXiv.org Machine Learning

2502.0178

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
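
The sparse CCA criterion quoted in the excerpt above can be illustrated with a rank-one alternating scheme. The sketch below is a simplified variant under stated assumptions: it treats the within-set covariances as identity after standardizing the columns, and it uses fixed soft-threshold levels (`lam_a`, `lam_b`, hypothetical parameter names) instead of searching for thresholds that meet the bounds $k_1$ and $k_2$. It is not the Graph CCA method of this paper, and the data are synthetic.

```python
import numpy as np

def soft_threshold(v, lam):
    """Shrink each entry of v toward zero by lam (the l1 proximal step)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def unit(v):
    """Rescale v to unit Euclidean norm."""
    norm = np.linalg.norm(v)
    if norm == 0.0:
        raise ValueError("threshold removed every loading; use a smaller lam")
    return v / norm

def sparse_cca_rank1(X, Y, lam_a, lam_b, n_iter=50):
    """One pair of sparse canonical vectors by alternating soft-thresholding."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = Xs.T @ Ys / X.shape[0]                        # p x q cross-correlations
    b = np.linalg.svd(C, full_matrices=False)[2][0]   # dense starting point
    for _ in range(n_iter):
        a = unit(soft_threshold(C @ b, lam_a))
        b = unit(soft_threshold(C.T @ a, lam_b))
    return a, b

# Synthetic data: only the first two columns of X and of Y share a latent signal.
rng = np.random.default_rng(0)
n = 1000
z = rng.normal(size=n)
X = rng.normal(size=(n, 10))
Y = rng.normal(size=(n, 8))
X[:, :2] = z[:, None] + 0.3 * rng.normal(size=(n, 2))
Y[:, :2] = z[:, None] + 0.3 * rng.normal(size=(n, 2))

a, b = sparse_cca_rank1(X, Y, lam_a=0.3, lam_b=0.3)
support_a = np.flatnonzero(a).tolist()
support_b = np.flatnonzero(b).tolist()
print(support_a, support_b)
```

With these settings the nonzero loadings fall on the two correlated columns in each set, which shows the variable-selection behavior the excerpt attributes to the $\ell_1$ penalty.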

A Shape-Based Functional Index for Objective Assessment of Pediatric Motor Function

Kumar, Shashwat, Rahman, Arafat, Gutierrez, Robert, Livermon, Sarah, McCrady, Allison N., Blemker, Silvia, Scharf, Rebecca, Srivastava, Anuj, Barnes, Laura E.

arXiv.org Artificial Intelligence | Jan-2-2025

Clinical assessments for neuromuscular disorders, such as Spinal Muscular Atrophy (SMA) and Duchenne Muscular Dystrophy (DMD), continue to rely on subjective measures to monitor treatment response and disease progression. We introduce a novel method using wearable sensors to objectively assess motor function during daily activities in 19 patients with DMD, 9 with SMA, and 13 age-matched controls. Pediatric movement data is complex due to confounding factors such as limb length variations in growing children and variability in movement speed. Our approach uses Shape-based Principal Component Analysis to align movement trajectories and identify distinct kinematic patterns, including variations in motion speed and asymmetry. Both DMD and SMA cohorts have individuals with motor function on par with healthy controls. Notably, patients with SMA showed greater activation of the motion asymmetry pattern. We further combined projections on these principal components with partial least squares (PLS) to identify a covariation mode with a canonical correlation of r = 0.78 (95% CI: [0.34, 0.94]) with muscle fat infiltration, the Brooke score (a motor function score), and age-related degenerative changes, proposing a novel motor function index. This data-driven method can be deployed in home settings, enabling better longitudinal tracking of treatment efficacy for children with neuromuscular disorders.

artificial intelligence, correlation, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2501.04721

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Virginia (0.05)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)
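
The PLS step mentioned in the abstract above can be sketched as the leading covariation mode between two blocks of variables. This is only an illustration under assumptions: the shape-based PCA alignment is not reproduced, `X` stands in for principal-component projections and `Y` for clinical variables, and all values are synthetic rather than taken from the study.

```python
import numpy as np

def first_pls_mode(X, Y):
    """Leading PLS mode: weight vectors maximizing covariance between the score pairs."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # The leading singular vectors of the cross-covariance give the weights.
    U, _, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    wx, wy = U[:, 0], Vt[0]
    scores_x, scores_y = Xc @ wx, Yc @ wy
    r = float(np.corrcoef(scores_x, scores_y)[0, 1])
    return wx, wy, r

rng = np.random.default_rng(0)
n = 41                                  # hypothetical number of participants
z = rng.normal(size=n)                  # shared latent severity factor (assumed)
X = np.outer(z, np.ones(5)) + 0.5 * rng.normal(size=(n, 5))   # stand-in movement features
Y = np.outer(z, np.ones(3)) + 0.5 * rng.normal(size=(n, 3))   # stand-in clinical measures

wx, wy, r = first_pls_mode(X, Y)
print(round(r, 2))
```

The correlation between the two score vectors plays the role of the correlation reported for the covariation mode; a confidence interval like the one in the abstract would need a resampling step that is not shown here.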

Canonical Correlation Guided Deep Neural Network

Chen, Zhiwen, Mo, Siwen, Ke, Haobin, Ding, Steven X., Jiang, Zhaohui, Yang, Chunhua, Gui, Weihua

arXiv.org Artificial Intelligence | Sep-28-2024

Learning representations of two views of data such that the resulting representations are highly linearly correlated is appealing in machine learning. In this paper, we present a canonical correlation guided learning framework, which can be realized by deep neural networks (CCDNN), to learn such a correlated representation. It is also a novel merging of multivariate analysis (MVA) and machine learning, which can be viewed as transforming MVA into end-to-end architectures with the aid of neural networks. Unlike linear canonical correlation analysis (CCA), kernel CCA and deep CCA, the optimization formulation in the proposed method is not restricted to maximizing correlation; instead, we treat canonical correlation as a constraint, which preserves the correlated representation learning ability and focuses more on the engineering tasks endowed by the optimization formulation, such as reconstruction, classification and prediction. Furthermore, to reduce the redundancy induced by correlation, a redundancy filter is designed. We illustrate the performance of CCDNN on various tasks. In experiments on the MNIST dataset, the results show that CCDNN has better reconstruction performance in terms of mean squared error and mean absolute error than DCCA and DCCAE. We also present the application of the proposed network to industrial fault diagnosis and remaining useful life cases for the classification and prediction tasks, respectively. The proposed method demonstrates superior performance in both tasks when compared to existing methods. An extension of CCDNN to much deeper networks with the aid of residual connections is also presented in the appendix.

artificial intelligence, correlation, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2409.19396

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

A prototype-based model for set classification

Mohammadi, Mohammad, Ghosh, Sreejita

arXiv.org Artificial Intelligence | Aug-25-2024

Classification of sets of inputs (e.g., images and texts) is an active area of research within both computer vision (CV) and natural language processing (NLP). A common way to represent a set of vectors is to model them as linear subspaces. In this contribution, we present a prototype-based approach for learning on the manifold formed from such linear subspaces, the Grassmann manifold. Our proposed method learns a set of subspace prototypes capturing the representative characteristics of classes and a set of relevance factors automating the selection of the dimensionality of the subspaces. This leads to a transparent classifier model which presents the computed impact of each input vector on its decision. Through experiments on benchmark image and text datasets, we have demonstrated the efficiency of our proposed classifier, compared to the transformer-based models in terms of not only performance and explainability but also computational resource requirements.

classification, dataset, prototype, (15 more...)

arXiv.org Artificial Intelligence

2408.1372

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
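
The abstract above represents each set of vectors as a linear subspace, a point on the Grassmann manifold. A minimal sketch of that representation, and of the principal angles commonly used to compare two subspaces, is given below. The cosines of the principal angles are the canonical correlations between the two bases, which is the link to this page's topic. The helper names and toy sets are assumptions, and the prototype and relevance learning of the paper are not reproduced.

```python
import numpy as np

def subspace_basis(vectors, dim):
    """Orthonormal basis (D x dim) of the best-fit dim-dimensional subspace of row vectors."""
    _, _, Vt = np.linalg.svd(np.asarray(vectors, dtype=float), full_matrices=False)
    return Vt[:dim].T

def principal_angles(A, B):
    """Principal angles (radians) between subspaces with orthonormal bases A and B."""
    cosines = np.linalg.svd(A.T @ B, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

set_1 = [[1, 0, 0], [2, 0, 0], [0, 1, 0]]    # vectors lying in the xy-plane
set_2 = [[0, 3, 0], [1, 1, 0], [5, 0, 0]]    # different vectors, same plane
set_3 = [[0, 0, 1], [0, 0, 2], [0, 0, -1]]   # vectors along the z-axis

plane_1 = subspace_basis(set_1, dim=2)
plane_2 = subspace_basis(set_2, dim=2)
z_axis = subspace_basis(set_3, dim=1)

angles_same = principal_angles(plane_1, plane_2)
angles_orth = principal_angles(plane_1, z_axis)
print(angles_same, angles_orth)
```

Two different sets that span the same plane are at angle zero, so the representation depends only on the span of each set and not on which vectors happen to be in it.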

The Randomized Dependence Coefficient

Neural Information Processing Systems | Mar-13-2024, 19:35:45 GMT

We introduce the Randomized Dependence Coefficient (RDC), a measure of nonlinear dependence between random variables of arbitrary dimension based on the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient. RDC is defined in terms of correlation of random non-linear copula projections; it is invariant with respect to marginal distribution transformations, has low computational cost and is easy to implement: just five lines of R code, included at the end of the paper.

coefficient, dependence measure, randomized dependence coefficient, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
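
The construction described in the abstract above (copula transform, random non-linear projections, then the largest canonical correlation) can be sketched in Python as follows. This is a transliteration under assumptions, not the authors' R code: the defaults `k=20` and `s=1/6`, the sine non-linearity, the rank-based copula estimate, and the singular-value cutoff `tol` used to keep the CCA step numerically stable are all choices made for this illustration.

```python
import numpy as np

def _copula_transform(x):
    """Map each column to its empirical CDF values and append a bias column."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n = x.shape[0]
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1   # ties broken arbitrarily
    return np.column_stack([ranks / n, np.ones(n)])

def _orthonormal_basis(F, tol=1e-10):
    """Orthonormal basis of the centered features, dropping near-null directions."""
    Fc = F - F.mean(axis=0)
    U, sv, _ = np.linalg.svd(Fc, full_matrices=False)
    return U[:, sv > tol * sv[0]]

def rdc(x, y, k=20, s=1 / 6, seed=0):
    """Sketch of a randomized dependence coefficient between samples x and y."""
    rng = np.random.default_rng(seed)
    cx, cy = _copula_transform(x), _copula_transform(y)
    # Random sine features of the copula-transformed samples.
    fx = np.sin(s / cx.shape[1] * cx @ rng.normal(size=(cx.shape[1], k)))
    fy = np.sin(s / cy.shape[1] * cy @ rng.normal(size=(cy.shape[1], k)))
    Qx, Qy = _orthonormal_basis(fx), _orthonormal_basis(fy)
    # Largest canonical correlation between the two feature sets.
    return float(np.linalg.svd(Qx.T @ Qy, compute_uv=False)[0])

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=1000)
rdc_dep = rdc(x, x ** 2)                            # nonlinear, non-monotone dependence
rdc_ind = rdc(x, rng.uniform(-1, 1, size=1000))     # independent sample
print(rdc_dep, rdc_ind)
```

The quadratic relationship has near-zero linear correlation yet scores high here, while the independent pair scores low, which is the behavior the abstract claims for a measure of nonlinear dependence.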

Efficient Algorithms for the CCA Family: Unconstrained Objectives with Unbiased Gradients

Chapman, James, Wells, Lennie, Aguila, Ana Lawry

arXiv.org Machine Learning | Nov-21-2023

The Canonical Correlation Analysis (CCA) family of methods is foundational in multi-view learning. Regularised linear CCA methods can be seen to generalise Partial Least Squares (PLS) and be unified with a Generalized Eigenvalue Problem (GEP) framework. However, classical algorithms for these linear methods are computationally infeasible for large-scale data. Extensions to Deep CCA show great promise, but current training procedures are slow and complicated. First we propose a novel unconstrained objective that characterizes the top subspace of GEPs. Our core contribution is a family of fast algorithms for stochastic PLS, stochastic CCA, and Deep CCA, simply obtained by applying stochastic gradient descent (SGD) to the corresponding CCA objectives. These methods show far faster convergence and recover higher correlations than the previous state-of-the-art on all standard CCA and Deep CCA benchmarks. This speed allows us to perform a first-of-its-kind PLS analysis of an extremely large biomedical dataset from the UK Biobank, with over 33,000 individuals and 500,000 variants. Finally, we not only match the performance of 'CCA-family' Self-Supervised Learning (SSL) methods on CIFAR-10 and CIFAR-100 with minimal hyper-parameter tuning, but also establish the first solid theoretical links to classical CCA, laying the groundwork for future insights.

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Machine Learning

2310.01012

Country:

South America > Uruguay > Artigas > Artigas (0.04)
Europe > San Marino > Fiorentino > Fiorentino (0.04)
Europe > Netherlands > Drenthe > Assen (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
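
The generalized eigenvalue view of CCA mentioned in the abstract above can be checked numerically on a small problem. The sketch below builds the block matrices A (cross-covariances) and B (within-view covariances), solves A w = λ B w with a dense classical solver, and compares the top eigenvalues with canonical correlations computed by QR and SVD. It is not the paper's stochastic algorithm or its unconstrained objective; function names and data are assumptions.

```python
import numpy as np

def cca_via_gep(X, Y):
    """Solve CCA as the generalized eigenproblem A w = lambda B w."""
    n = X.shape[0]
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    p, q = Xc.shape[1], Yc.shape[1]
    Sxx, Syy, Sxy = Xc.T @ Xc / n, Yc.T @ Yc / n, Xc.T @ Yc / n
    A = np.block([[np.zeros((p, p)), Sxy], [Sxy.T, np.zeros((q, q))]])
    B = np.block([[Sxx, np.zeros((p, q))], [np.zeros((q, p)), Syy]])
    # With B = L L^T, the problem reduces to a symmetric standard eigenproblem.
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    evals, evecs = np.linalg.eigh(L_inv @ A @ L_inv.T)
    order = np.argsort(evals)[::-1]
    W = L_inv.T @ evecs[:, order]          # generalized eigenvectors, B-orthonormal
    return evals[order], W, B

def canonical_correlations_svd(X, Y):
    """Reference canonical correlations from QR followed by SVD."""
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
Y = X[:, :3] @ rng.normal(size=(3, 3)) + rng.normal(size=(300, 3))

evals, W, B = cca_via_gep(X, Y)
rho = canonical_correlations_svd(X, Y)
print(evals[:3], rho)
```

Replacing B with the identity turns the same eigenproblem into an SVD of the cross-covariance, whose singular vectors are PLS directions. The dense Cholesky and eigendecomposition used here scale cubically with the number of variables, which is the cost that motivates stochastic alternatives for large data.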

Inferring independent sets of Gaussian variables after thresholding correlations

Saha, Arkajyoti, Witten, Daniela, Bien, Jacob

arXiv.org Machine Learning | Nov-2-2022

We consider testing whether a set of Gaussian variables, selected from the data, is independent of the remaining variables. We assume that this set is selected via a very simple approach that is commonly used across scientific disciplines: we select a set of variables for which the correlation with all variables outside the set falls below some threshold. Unlike other settings in selective inference, failure to account for the selection step leads, in this setting, to excessively conservative (as opposed to anti-conservative) results. Our proposed test properly accounts for the fact that the set of variables is selected from the data, and thus is not overly conservative. To develop our test, we condition on the event that the selection resulted in the set of variables in question. To achieve computational tractability, we develop a new characterization of the conditioning event in terms of the canonical correlation between the groups of random variables. In simulation studies and in the analysis of gene co-expression networks, we show that our approach has much higher power than a "naive" approach that ignores the effect of selection.

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Machine Learning

2211.01521

Country:

North America > United States > California (0.14)
Asia > Middle East > Israel (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
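
The selection step and the canonical correlation between groups described in the abstract above can be sketched as follows. The sketch assumes that "selected sets" are the connected components of the graph obtained by thresholding absolute correlations, which is one reading of the abstract, and it only computes the quantities involved; the selective test itself is not implemented. The correlation matrix is a toy example.

```python
import numpy as np

def threshold_groups(R, c):
    """Groups of variables: connected components of the graph with edges |R_ij| > c."""
    p = R.shape[0]
    adjacent = (np.abs(R) > c) & ~np.eye(p, dtype=bool)
    labels = -np.ones(p, dtype=int)
    n_groups = 0
    for start in range(p):
        if labels[start] >= 0:
            continue
        labels[start] = n_groups
        stack = [start]
        while stack:                         # depth-first search over the graph
            i = stack.pop()
            for j in np.flatnonzero(adjacent[i]):
                if labels[j] < 0:
                    labels[j] = n_groups
                    stack.append(j)
        n_groups += 1
    return [np.flatnonzero(labels == g).tolist() for g in range(n_groups)]

def inverse_sqrt(S):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def group_canonical_correlations(R, group):
    """Canonical correlations between a group of variables and all remaining ones."""
    group = np.asarray(group)
    rest = np.setdiff1d(np.arange(R.shape[0]), group)
    M = (inverse_sqrt(R[np.ix_(group, group)])
         @ R[np.ix_(group, rest)]
         @ inverse_sqrt(R[np.ix_(rest, rest)]))
    return np.linalg.svd(M, compute_uv=False)

# Toy correlation matrix: two tight pairs with weak cross-correlations.
R = np.array([[1.0, 0.8, 0.1, 0.0],
              [0.8, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.7],
              [0.0, 0.1, 0.7, 1.0]])

groups = threshold_groups(R, c=0.5)
rho = group_canonical_correlations(R, groups[0])
print(groups, rho)
```

Every pairwise correlation across the two groups is at most 0.1, yet the largest canonical correlation is about 0.41, which illustrates why a statement about independence of whole groups is naturally phrased through canonical correlations rather than pairwise ones.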

Day 176 (Computer Vision) -- Age-Invariant Face Recognition

#artificialintelligence | Sep-2-2021, 10:40:09 GMT

Feature Factorization -- A linear factorization module is introduced that decomposes the entire set of facial features into two sets of uncorrelated components (age and identity). The age-related details are then retrieved through a mapping function 'R', and the residual part is taken as the identity component. At inference time, only the identity-related features are used for face recognition. The first backbone network is similar to ResNets and extracts the initial features from the entire image. Decorrelated Adversarial Learning -- Even though we want the two components to be independent of each other, in practice the identity component retains some features from the age information.

adversarial learning, age-invariant face recognition, correlation, (7 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Vision > Face Recognition (0.96)
